14 research outputs found

    ADAPTIVE UNMANNED VEHICLE AUTOPILOTING USING WEBRTC VIDEO ANALYSIS

    We exploit the new features provided by WebRTC in terms of interoperability and state-of-the-art real-time communications in order to develop a system for piloting unmanned vehicles using video analysis. Specifically, we define a topology where a ROS-based vehicle transmits its video via WebRTC to an intermediate server, which in turn relays it to a client. The server takes advantage of the OpenCV library and applies video analysis with respect to a task (e.g. face detection) selected by the client. The corresponding commands are transmitted to the vehicle, resulting in an automatically driven unmanned vehicle. The client monitors the vehicle's movement and can dynamically change the selected use case, that is, either slightly change its operation (e.g. from tracking humans to tracking children) or enable an entirely new core philosophy (e.g. switching to fire detection), by sending the appropriate requests to the server. Upon reception of these requests, the server utilizes the corresponding OpenCV functionalities to serve the new task and sends the new piloting commands to the vehicle, forcing the system to adopt a new autopiloting mode. The communication between the vehicle, the server and the client is established using SIP/SDP and orchestrated via a WebSocket server that acts as a Signaling Server; the media are transferred over SRTP/UDP, and the commands are carried via the WebRTC Data Channel over SCTP.
    We explain and describe how to combine all of these heterogeneous components (WebRTC – OpenCV – ROS) in order to compose a web-based infrastructure for autopiloting ROS-based vehicles according to a specific use case. Finally, the results prove our concept: a horizontal infrastructure that (a) consists of a modular architecture, (b) provides the necessary components for machine-to-machine communication, (c) uses state-of-the-art technologies, (d) allows a developer to implement her own logic vertically, and (e) provides IoT with a solution that can easily be exploited in numerous ways.
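The server-side control loop sketched in the abstract (OpenCV video analysis on the server, piloting commands sent back to the vehicle over the WebRTC Data Channel) can be illustrated in a few lines. This is a minimal, hypothetical sketch, not the paper's implementation: `steering_command` is an assumed name, the bounding box would come from an OpenCV detector (e.g. a Haar cascade), and the returned string stands in for the ROS velocity message that would travel over the Data Channel.

```python
def steering_command(frame_width, bbox, dead_zone=0.1):
    """Map a detected target's bounding box (x, y, w, h) to a piloting command.

    Returns 'left', 'right', or 'forward' depending on where the target's
    centre lies relative to the frame centre; dead_zone is the fraction of
    the half-width within which no turn is issued.
    """
    x, _, w, _ = bbox
    centre = x + w / 2.0
    # Normalised horizontal offset of the target from frame centre, in [-1, 1].
    offset = (centre - frame_width / 2.0) / (frame_width / 2.0)
    if offset < -dead_zone:
        return "left"
    if offset > dead_zone:
        return "right"
    return "forward"
```

In the described system, a command like this would be serialized over the Data Channel (SCTP) and translated into ROS motion messages on the vehicle; the client's scenario change would swap which OpenCV detector feeds the bounding boxes.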

    Improving filling level classification with adversarial training

    We investigate the problem of classifying, from a single image, the level of content in a cup or a drinking glass. This problem is made challenging by several ambiguities caused by transparencies, shape variations and partial occlusions, and by the availability of only small training datasets. In this paper, we tackle this problem with an appropriate strategy for transfer learning. Specifically, we use adversarial training on a generic source dataset and then refine the training with a task-specific dataset. We also discuss and experimentally evaluate several training strategies and their combination on a range of container types of the CORSMAL Containers Manipulation dataset. We show that transfer learning with adversarial training in the source domain consistently improves the classification accuracy on the test set and limits the overfitting of the classifier to specific features of the training data. Comment: Accepted to the 28th IEEE International Conference on Image Processing (ICIP) 202
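As a rough, hypothetical sketch of the two-stage strategy (adversarial training on a generic source set, then refinement on the small task-specific set), the toy below substitutes a logistic-regression model and FGSM-style perturbations for the paper's CNN and training pipeline; all data and names are illustrative, not taken from the paper.

```python
import numpy as np

def fgsm_perturb(w, b, x, y, eps):
    """FGSM-style adversarial example for logistic loss: x + eps * sign(d loss / d x)."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))
    return x + eps * np.sign((p - y) * w)

def train(w, b, X, Y, lr=0.1, eps=0.0, steps=200):
    """SGD on logistic loss; with eps > 0 each sample is adversarially perturbed."""
    for _ in range(steps):
        for x, y in zip(X, Y):
            x_in = fgsm_perturb(w, b, x, y, eps) if eps > 0 else x
            p = 1.0 / (1.0 + np.exp(-(x_in @ w + b)))
            w = w - lr * (p - y) * x_in
            b = b - lr * (p - y)
    return w, b

# Stage 1: adversarial training on a generic "source" set (toy data).
X_src = np.array([[2.0, 0.0], [1.5, 0.5], [-2.0, 0.0], [-1.5, -0.5]])
Y_src = np.array([1, 1, 0, 0])
w, b = train(np.zeros(2), 0.0, X_src, Y_src, eps=0.2)

# Stage 2: plain fine-tuning on a small "task-specific" set.
X_task = np.array([[1.0, 0.2], [-1.0, -0.2]])
Y_task = np.array([1, 0])
w, b = train(w, b, X_task, Y_task, eps=0.0, steps=50)
```

The intended effect mirrors the paper's finding: the adversarially trained source-stage model relies less on idiosyncratic features, so the fine-tuned classifier overfits less to the small task dataset.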

    SparseFool: a few pixels make a big difference

    Deep Neural Networks have achieved extraordinary results on image classification tasks, but have been shown to be vulnerable to attacks with carefully crafted perturbations of the input data. Although most attacks usually change the values of many of an image's pixels, it has been shown that deep networks are also vulnerable to sparse alterations of the input. However, no efficient method has been proposed to compute sparse perturbations. In this paper, we exploit the low mean curvature of the decision boundary and propose SparseFool, a geometry-inspired sparse attack that controls the sparsity of the perturbations. Extensive evaluations show that our approach outperforms related methods and scales to high-dimensional data. We further analyze the transferability and the visual effects of the perturbations, and show the existence of shared semantic information across the images and the networks. Finally, we show that adversarial training using ℓ∞ perturbations can slightly improve the robustness against sparse additive perturbations.

    SparseFool: a few pixels make a big difference

    Deep Neural Networks have achieved extraordinary results on image classification tasks, but have been shown to be vulnerable to attacks with carefully crafted perturbations of the input data. Although most attacks usually change the values of many of an image's pixels, it has been shown that deep networks are also vulnerable to sparse alterations of the input. However, no computationally efficient method has been proposed to compute sparse perturbations. In this paper, we exploit the low mean curvature of the decision boundary and propose SparseFool, a geometry-inspired sparse attack that controls the sparsity of the perturbations. Extensive evaluations show that our approach computes sparse perturbations very fast and scales efficiently to high-dimensional data. We further analyze the transferability and the visual effects of the perturbations, and show the existence of shared semantic information across the images and the networks. Finally, we show that adversarial training can only slightly improve the robustness against sparse additive perturbations computed with SparseFool.
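SparseFool itself linearizes the (low-mean-curvature) decision boundary of a deep network; for an actually linear classifier that approximation becomes exact, and a sparse attack reduces to a one-coordinate closed form. The toy below is only meant to convey the sparsity idea, not the paper's algorithm; the function name and overshoot constant are illustrative.

```python
import numpy as np

def sparse_flip(w, b, x, overshoot=1.02):
    """Flip the decision sign(w·x + b) of a linear classifier by changing
    as few coordinates as possible (assumes w has a nonzero entry).

    Returns the adversarial point and the list of changed coordinate indices.
    """
    x_adv = x.astype(float).copy()
    sign0 = np.sign(w @ x_adv + b)
    changed = []
    for i in np.argsort(-np.abs(w)):          # most influential coordinate first
        score = w @ x_adv + b
        if np.sign(score) != sign0:
            break                              # decision already flipped
        # Cancel (and slightly overshoot) the whole score via coordinate i alone.
        x_adv[i] -= overshoot * score / w[i]
        changed.append(int(i))
    return x_adv, changed
```

Because one step cancels the entire score through the heaviest-weighted coordinate, a single changed "pixel" suffices in the linear case, which is the intuition behind "a few pixels make a big difference".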

    Hold me tight! Influence of discriminative features on deep network boundaries

    No full text
    Important insights into the explainability of neural networks reside in the characteristics of their decision boundaries. In this work, we borrow tools from the field of adversarial robustness and propose a new perspective that relates dataset features to the distance of samples to the decision boundary. This enables us to carefully tweak the position of the training samples and measure the induced changes on the boundaries of CNNs trained on large-scale vision datasets. We use this framework to reveal some intriguing properties of CNNs. Specifically, we rigorously confirm that neural networks exhibit a high invariance to non-discriminative features, and show that very small perturbations of the training samples in certain directions can lead to sudden invariances in the orthogonal ones. This is precisely the mechanism that adversarial training uses to achieve robustness.
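For intuition, the quantity being measured has a closed form when the model is linear, and the invariance to non-discriminative features is easy to verify there: moving a sample along a direction orthogonal to the weight vector leaves its distance to the boundary unchanged. The paper estimates such distances for CNNs with adversarial perturbations; the toy below is a linear stand-in with illustrative names.

```python
import numpy as np

def boundary_distance(w, b, x):
    """Distance of x to the hyperplane w·x + b = 0 (the linear decision boundary)."""
    return abs(w @ x + b) / np.linalg.norm(w)

w, b = np.array([3.0, 4.0]), 0.0
x = np.array([1.0, 0.0])

# A unit direction orthogonal to w: it carries no class information,
# i.e. it is "non-discriminative" for this classifier.
v = np.array([4.0, -3.0]) / 5.0
assert np.isclose(w @ v, 0.0)
```

Sliding `x` along `v` by any amount leaves `boundary_distance` unchanged, which is the linear analogue of the invariance the paper confirms for CNNs.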